Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods
Abstract
BACKGROUND Viral-host protein-protein interactions (PPIs) play a vital role in pathogenesis, since they determine viral infection of the host and the regulation of host proteins. Identifying key viral-host PPIs has important implications for therapeutics.
METHODS In this study, a systematic attempt was made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information, using viral-host PPIs from VirusMINT. Three well-known supervised machine learning methods that are commonly used in PPI prediction (SVM, Naïve Bayes and Random Forest) were evaluated using five-fold cross-validation.
RESULTS Out of 44 descriptors, the best features were found to be domain-domain association and the methionine, serine and valine amino acid composition of viral proteins. The SVM-based method achieved a higher sensitivity (67%) than Naïve Bayes (37.49%) and Random Forest (55.66%). However, the specificity of Naïve Bayes was the highest (99.52%), compared with SVM (74%) and Random Forest (89.08%). Overall, SVM and Random Forest achieved accuracies of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on a blind dataset and attained a sensitivity of 64%, a specificity of 83% and an accuracy of 74%. In addition, unknown potential targets of hepatitis B virus-human and hepatitis E virus-human PPIs were predicted with the proposed SVM model and validated by gene ontology enrichment analysis. The proposed model shows that the hepatitis B virus "C protein" binds to membrane docking protein, while the "X protein" and "P protein" interact with cell-killing and metabolic process proteins, respectively.
CONCLUSION The proposed method can predict large-scale interspecies viral-human PPIs. The nature and function of unknown viral proteins (HBV and HEV) and their interacting host protein partners were identified using the optimised SVM model.
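The abstract names two ingredients that are easy to make concrete: amino acid composition as a sequence feature, and sensitivity, specificity and accuracy as performance measures. The sketch below is a minimal illustration only, not the authors' implementation. The function names `amino_acid_composition` and `binary_metrics` are hypothetical, and composition is assumed to be the percentage of each of the 20 standard residues, which the abstract does not specify.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the percentage of each standard amino acid in a protein sequence.

    Assumption: composition is defined as 100 * count / length over the
    standard residues; non-standard characters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for 0/1 labels.

    1 marks an interacting pair, 0 a non-interacting pair.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true interactions recovered
        "specificity": tn / (tn + fp),  # fraction of non-interactions rejected
        "accuracy": (tp + tn) / len(y_true),
    }
```

For a toy viral sequence, `amino_acid_composition("MSVMSV")["M"]` gives the methionine percentage, and `binary_metrics(labels, predictions)` reports the three measures quoted in the results for any pair of 0/1 label lists.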
Similar resources
Prediction of Host-Virus Protein-Protein Interactions
New infectious viruses appear regularly, established ones fail to be eradicated, posing significant challenges to public health. Our lack of understanding of the intimate relationship between viruses and their hosts makes it difficult to develop effective therapies. Protein-protein interactions (PPIs) are key players in the cell generally and viruses exploit them for their purposes. Considerabl...
Prediction of Protein Sub-Mitochondria Locations Using Protein Interaction Networks
Background: Prediction of the protein localization is among the most important issues in the bioinformatics that is used for the prediction of the proteins in the cells and organelles such as mitochondria. In this study, several machine learning algorithms are applied for the prediction of the intracellular protein locations. These algorithms use the features extracted from pro...
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
Rabies Infection: An Overview of Lyssavirus-Host Protein Interactions
Viruses are obligatory intracellular parasites that use cell proteins to take the control of the cell functions in order to accomplish their life cycle. Studying the viral-host interactions would increase our knowledge of the viral biology and mechanisms of pathogenesis. Studies on pathogenesis mechanisms of lyssaviruses, which are the causative agents of rabies, have revealed some important ho...
Prediction of Compound-Protein Interactions with Machine Learning Methods
In silico prediction of compound-protein interactions from heterogeneous biological data is critical in the process of drug development. In this chapter the authors review several supervised machine learning methods to predict unknown compound-protein interactions from chemical structure and genomic sequence information simultaneously. The authors review several kernel-based algorithms from two ...